Perf[BMQ,MQB]: stable GC history preventing allocations #498
base: main
Conversation
Force-pushed from 992dd4c to adcdc17.
Build 354 of commit adcdc17 has completed with FAILURE
Force-pushed from adcdc17 to 310abe9.
Signed-off-by: Evgeny Malygin <[email protected]>
Force-pushed from 310abe9 to 1196b50.
Build 355 of commit 1196b50 has completed with FAILURE
Before, performing gc on history was always sort of best-effort and done in the background. I think I agree on approach 1 not being ideal: it doesn't get at the core of the problem, which is the mismatched pacing of PUTs and gc; gc is still too periodic. Approach 2 is interesting; I hadn't considered moving gc into the critical path. With approach 2, if (and only if) background gc can't keep history limited to the configured time, the broker starts doing gc on the critical data path when processing incoming PUTs. For that gc, I'm curious if you've got any measurements of the gc time; do we have a rough number for how long it takes? I suspect it's pretty cheap. It'd be interesting to see if a smaller batch of gc on PUTs helps at all (in terms of latency, maybe).
Actually, we do
That is correct. However, there should also be some overhead of calling
Yes, I think it is cheap, but I will also add benchmarks.
Problem overview
Consider a cluster processing 100k PUT msgs/s at its limit, meaning it can barely process this rate of PUTs without NACKs, while producers generate exactly this rate of messages.
Now consider a rollover or any other lag causing the Queue dispatcher thread to stall for 100 ms.
After this time, there will be 10k pending PUT messages waiting in the QUEUE dispatcher queue:
[Put1]...[Put10000]
Now, consider a GC called before this block of PUTs: it cleaned something from the queue history, but not everything, so another GC call was scheduled to the QUEUE dispatcher:
[Put1]...[Put10000][gcHistory]
To call this `gcHistory`, we first need to process all the PUTs ahead of it in the queue. So we process 10k PUTs and add them to the queue history. Since in this scenario producers generate 100k msgs/s and the cluster can process 100k msgs/s and no more, by the time we process Put1-Put10000 there will be another 10k PUTs in the queue:
[gcHistory][Put10001]...[Put20000]
So when we call `gcHistory`, we will remove at most 1k msgs from the history (`k_GC_MESSAGES_BATCH_SIZE = 1000`). But at the same time, we will add 10k new items to the queue history and get another 10k PUTs in the QUEUE dispatcher queue, delaying the next `gcHistory`.

And this process will continue: we will constantly allocate new items in the queue history items pool, and we will eventually upsize the number of buckets in the ordered hashmap, causing a performance lag and a full rehash of the hashtable.
After a 5-minute run at 80k msgs/s, the queue history holds 20M items, with 3 GB allocated for the queue history (`Handles`).
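The pacing mismatch can be sketched with a toy model (the constants mirror the numbers in the scenario above; this is not the broker code):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Illustrative constants from the scenario above (not actual broker config).
const std::size_t k_PUTS_PER_CYCLE = 10000;  // PUTs drained before gcHistory runs
const std::size_t k_GC_BATCH       = 1000;   // k_GC_MESSAGES_BATCH_SIZE

// One dispatcher cycle: insert the pending PUT batch into history,
// then run a single gcHistory batch.
std::size_t nextHistorySize(std::size_t historySize)
{
    historySize += k_PUTS_PER_CYCLE;                   // every PUT lands in history
    historySize -= std::min(historySize, k_GC_BATCH);  // gc removes at most one batch
    return historySize;
}

std::size_t historyAfterCycles(std::size_t cycles)
{
    std::size_t size = 0;
    for (std::size_t i = 0; i < cycles; ++i) {
        size = nextHistorySize(size);
    }
    return size;
}
```

Each cycle the history grows by a net 9k items, so it never drains: after 100 cycles the toy model already holds 900k items, matching the unbounded growth described above.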
Alternative 1

Set `k_GC_MESSAGES_BATCH_SIZE = 1000000`.

Alternative 2
`insert` to history is the only operation that can increase the history size, and, if we know that GC is required, we can enforce GC on any `insert` operation. So the workflow is:

1. We call `gcHistory` and find out that we cleaned `1000` items from the history, but there are more. We set the `d_requireGC` flag for the history.
2. We call `insert` to the history on PUT. `insert` checks the `d_requireGC` flag and execs `gc` for another batch of items.
3. Eventually, we call `gc` when there are no items to `gc`. We unset the `d_requireGC` flag, so the following PUTs do not cause GC.
4. If the history grows past the limit again, a later `gcHistory` sets `d_requireGC` again.

So we pay some CPU time on PUT operations to do the GC, but at the same time we do not need to wait for another `gcHistory` callback in the same dispatcher queue.

Comparison
The 80k msgs/s priority stress test shows the same results as a 90k msgs/s test with the proposed change. This means that with this change we can actually hold up better at higher throughputs.
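The insert-triggered GC flow of Alternative 2 above can be sketched as follows; this is a minimal model assuming a history ordered by expiration time, and `History`, `gcBatch`, and the constants are illustrative names, not the actual BMQ types:

```cpp
#include <cassert>
#include <cstddef>
#include <set>

class History {
  private:
    std::multiset<long long> d_items;     // items ordered by expiration time
    bool                     d_requireGC; // set when one gc batch was not enough

    static const std::size_t k_GC_BATCH = 1000;  // stand-in for the batch constant

    // Remove at most one batch of expired items; return true if more remain.
    bool gcBatch(long long now)
    {
        std::size_t removed = 0;
        while (!d_items.empty() && *d_items.begin() <= now) {
            if (removed == k_GC_BATCH) {
                return true;  // batch exhausted, expired items remain
            }
            d_items.erase(d_items.begin());
            ++removed;
        }
        return false;  // nothing expired is left; flag can be unset
    }

  public:
    History() : d_requireGC(false) {}

    // Background gcHistory: one batch; remember whether we fell behind.
    void gcHistory(long long now) { d_requireGC = gcBatch(now); }

    // Insert on PUT: if background gc fell behind, do one more batch inline.
    void insert(long long expiration, long long now)
    {
        if (d_requireGC) {
            d_requireGC = gcBatch(now);  // unset once nothing is left to gc
        }
        d_items.insert(expiration);
    }

    std::size_t size() const { return d_items.size(); }
};
```

The key property is that every PUT pays for at most one extra GC batch, and the flag self-clears as soon as a batch finds nothing expired, so steady-state PUTs are unaffected.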